Speaker adaptation using combined transformation and Bayesian methods
Authors
V. Digalakis, Electronic and Computer Engineering Dept., Technical University of Crete, Kounoupidiana, Chania, 73100 Greece (tel. +30-821-46566 x226, fax +30-821-58708)
L. Neumeyer, SRI International, 333 Ravenswood Ave., Menlo Park, CA 94025, USA (tel. +1-415-859-4522, fax +1-415-859-5984)
Abstract
Adapting the parameters of a statistical speaker-independent continuous-speech recognizer to the speaker and the channel can significantly improve the recognition performance and robustness of the system. In continuous mixture-density hidden Markov models the number of component densities is typically very large, and it may not be feasible to acquire a sufficient amount of adaptation data for robust maximum-likelihood estimates. To solve this problem, we have recently proposed a constrained estimation technique for Gaussian mixture densities. To improve the behavior of our adaptation scheme for large amounts of adaptation data, we combine it here with Bayesian techniques. We evaluate our algorithms on the large-vocabulary Wall Street Journal corpus for nonnative speakers of American English. The recognition error rate is approximately halved with only a small amount of adaptation data, and it approaches the speaker-independent accuracy achieved for native speakers.
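The combination described in the abstract can be illustrated with a minimal sketch for a single Gaussian mean. It assumes a diagonal affine transform (scale and offset per dimension) as the constrained transformation step, and a MAP update with a conjugate prior centered on the transformed mean as the Bayesian step. The function names, the prior weight `tau`, and the per-frame occupancy weights are hypothetical choices made for illustration; the abstract does not specify these details.

```python
def transform_mean(mean, scale, offset):
    """Transformation step: apply a diagonal affine map a*mu + b.

    scale and offset are assumed to have been estimated elsewhere
    from the adaptation data (not shown here).
    """
    return [a * m + b for m, a, b in zip(mean, scale, offset)]


def map_update_mean(prior_mean, frames, weights, tau):
    """Bayesian step: MAP estimate of a Gaussian mean.

    prior_mean: center of the prior (here, the transformed mean)
    frames:     list of observation vectors
    weights:    occupancy of this Gaussian for each frame
    tau:        prior weight (hypothetical tuning constant)
    """
    occupancy = sum(weights)
    dim = len(prior_mean)
    weighted_sum = [
        sum(w * x[d] for x, w in zip(frames, weights)) for d in range(dim)
    ]
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


def adapt_mean(si_mean, scale, offset, frames, weights, tau=10.0):
    """Transform the speaker-independent mean, then refine it by MAP."""
    prior = transform_mean(si_mean, scale, offset)
    return map_update_mean(prior, frames, weights, tau)
```

With no adaptation frames the result is just the transformed mean. As the occupancy grows, the estimate moves toward the weighted sample mean of the adaptation data, which is the behavior one would want for large amounts of adaptation data.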
Related articles
Robust Speaker Clustering in Eigenspace
In this paper we propose a speaker clustering scheme working in 'Eigenspace'. Speaker models are transformed to a low-dimensional subspace using 'Eigenvoices'. For the speaker clustering procedure, simple distance measures, e.g. Euclidean distance, can be applied. Moreover, clustering can be accomplished with base models (for Eigenvoice projection) like Gaussian Mixture Models as well as conventi...
Acoustic model adaptation for automatic speech recognition and animal vocalization classification
Jidong Tao, B.Eng., M.S., Marquette University, 2009. Automatic speech recognition (ASR) converts human speech to readable text. Acoustic model adaptation, also called speaker adaptation, is one of the most promising techniques in ASR for improving recognition accuracy. Adaptation works by tuning a g...
Improved Bayesian learning of hidden Markov models for speaker adaptation
We propose an improved maximum a posteriori (MAP) learning algorithm of continuous-density hidden Markov model (CDHMM) parameters for speaker adaptation. The algorithm is developed by sequentially combining three adaptation approaches. First, the clusters of speaker-independent HMM parameters are locally transformed through a group of transformation functions. Then, the transformed HMM paramete...
Online Bayesian tree-structured transformation of HMMs with optimal model selection for speaker adaptation
This paper presents a new recursive Bayesian learning approach for transformation parameter estimation in speaker adaptation. Our goal is to incrementally transform or adapt a set of hidden Markov model (HMM) parameters for a new speaker and gain a large performance improvement from a small amount of adaptation data. By constructing a clustering tree of HMM Gaussian mixture components, the linear...
On-line Bayesian speaker adaptation using tree-structured transformation and robust priors
This paper presents new results obtained with our recently proposed on-line Bayesian learning approach for affine transformation parameter estimation in speaker adaptation. The on-line Bayesian learning technique allows updating parameter estimates after each utterance, and it can accommodate flexible forms of transformation functions as well as prior probability density functions. We show through ex...
Journal title:
- IEEE Trans. Speech and Audio Processing
Volume 4
Publication date: 1995